National Repository of Grey Literature 5 records found  Search took 0.01 seconds. 
Extension of Apache Tika with Industrial File Formats Text Extraction
Rešetár, René ; Burget, Radek (referee) ; Rychlý, Marek (advisor)
The goal of the bachelor's thesis was to extend the parsers of the Apache Tika project with data and table extraction from industrial document formats from laboratory instruments. These data will be stored in a structured format according to a certain scheme. In the theoretical part, the supplied industrial formats, the Apache Tika project and the possibilities of its expansion were examined. In the practical part, a tool was designed and implemented, which classifies documents using the Apache Tika project, processes them, creates structured data from them in the JSON format and subsequently validates them. Finally, a set of tests was created to verify and demonstrate the properties of the solution.
Automatic Test Input Generation for Information Systems
Naňo, Andrej ; Fiedor, Tomáš (referee) ; Smrčka, Aleš (advisor)
ISAGENis a tool for the automatic generation of structurally complex test inputs that imitate real communication in the context of modern information systems . Complex, typically tree-structured data currently represents the standard means of transmitting information between nodes in distributed information systems. Automatic generator ISAGENis founded on the methodology of data-driven testing and uses concrete data from the production environment as the primary characteristic and specification that guides the generation of new similar data for test cases satisfying given combinatorial adequacy criteria. The main contribution of this thesis is a comprehensive proposal of automated data generation techniques together with an implementation, which demonstrates their usage. The created solution enables testers to create more relevant testing data, representing production-like communication in information systems.
Automatic Test Input Generation for Information Systems
Naňo, Andrej ; Fiedor, Tomáš (referee) ; Smrčka, Aleš (advisor)
ISAGENis a tool for the automatic generation of structurally complex test inputs that imitate real communication in the context of modern information systems . Complex, typically tree-structured data currently represents the standard means of transmitting information between nodes in distributed information systems. Automatic generator ISAGENis founded on the methodology of data-driven testing and uses concrete data from the production environment as the primary characteristic and specification that guides the generation of new similar data for test cases satisfying given combinatorial adequacy criteria. The main contribution of this thesis is a comprehensive proposal of automated data generation techniques together with an implementation, which demonstrates their usage. The created solution enables testers to create more relevant testing data, representing production-like communication in information systems.
Extension of Apache Tika with Industrial File Formats Text Extraction
Rešetár, René ; Burget, Radek (referee) ; Rychlý, Marek (advisor)
The goal of the bachelor's thesis was to extend the parsers of the Apache Tika project with data and table extraction from industrial document formats from laboratory instruments. These data will be stored in a structured format according to a certain scheme. In the theoretical part, the supplied industrial formats, the Apache Tika project and the possibilities of its expansion were examined. In the practical part, a tool was designed and implemented, which classifies documents using the Apache Tika project, processes them, creates structured data from them in the JSON format and subsequently validates them. Finally, a set of tests was created to verify and demonstrate the properties of the solution.
ScraperWiki Tutorial
Levine, Thomas
The objective of the workshop, or better hackathon, was to get the data into a structured format, and join it with data from another sources – together with an overview and showing by example what is possible with scraping. Thomas identified targets for web scraping and navigating the complexity of different types of web pages and introduced that in a few half-hour-long and hour-long modules that catered to different audiences.
Slides: Download fulltextPDF

Interested in being notified about new results for this query?
Subscribe to the RSS feed.